ABSTRACT

The theoretical prediction that trigonometric parallaxes suffer from a
statistical effect has become topical again now that the results of the
Hipparcos satellite have become available. This statistical effect, the
so-called Lutz-Kelker bias, causes measured parallaxes to be too large.
As a consequence, inferred distances, and hence inferred luminosities,
are too small. Published analytic calculations of the Lutz-Kelker bias
indicate that the inferred luminosity of an object is, on average, 30%
too small when the error in the parallax is only 17.5%. Yet this bias
has never been determined empirically. In this paper we investigate
whether there is such a bias by comparing the best Hipparcos parallaxes
with ground-based measurements. We find that there is indeed a large
bias affecting parallaxes, with an average and scatter comparable to
predictions. We propose a simple method to correct for the LK bias, and
apply it successfully to a sub-sample of our stars. We then analyze the
sample of 26 'best' Cepheids used by Feast & Catchpole (1997) to derive
the zero-point of the Period-Luminosity relation. The final result is
based on the 20 fundamental mode pulsators and leads to a distance
modulus to the Large Magellanic Cloud, based on Cepheid parallaxes, of
18.56 ± 0.08, consistent with previous estimates.
1 INTRODUCTION

Although the effect was discussed as early as 1953 (Trumpler & Weaver),
Lutz & Kelker (1973) were the first to quantify the bias in the absolute
magnitude of a star estimated from its observed trigonometric parallax.
The principle of the Lutz-Kelker bias (hereafter LK bias) is relatively
easy to understand. A given parallax $\pi$ with a measurement error
$\sigma_\pi$ yields a distance d with an upper and a lower bound. Stars
at a smaller distance and stars located further away can both scatter to
the observed distance. Since there are more stars outside than inside
the distance range, simply because of the different sampled volumes,
more stars from outside the distance range will scatter into it than
stars from inside will scatter out. This effect causes a systematic bias
such that measured parallaxes will on average yield too small distances.
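The volume argument above can be illustrated with a small Monte Carlo experiment. The sketch below is not taken from the paper: it assumes a uniform space density out to a hypothetical limit `d_max_pc`, Gaussian parallax errors of fixed size `sigma_mas`, and a narrow window of observed parallaxes. All names and default values are illustrative.

```python
import math
import random

def simulate_lk_bias(n_stars=200_000, d_max_pc=500.0, sigma_mas=1.5,
                     window_mas=(9.5, 10.5), seed=1):
    """Return (mean observed parallax, mean true parallax, mean M_obs - M_true)
    for simulated stars whose observed parallax falls inside window_mas."""
    rng = random.Random(seed)
    obs, true = [], []
    for _ in range(n_stars):
        # Uniform space density: the number of stars within d scales as d**3.
        # 1 - random() lies in (0, 1], which avoids a zero distance.
        d = d_max_pc * (1.0 - rng.random()) ** (1.0 / 3.0)
        pi_true = 1000.0 / d                     # true parallax in mas
        pi_obs = rng.gauss(pi_true, sigma_mas)   # add Gaussian measurement error
        if window_mas[0] <= pi_obs <= window_mas[1]:
            obs.append(pi_obs)
            true.append(pi_true)
    mean_obs = sum(obs) / len(obs)
    mean_true = sum(true) / len(true)
    # M = V + 5 log10(pi) + 5, so the magnitude offset depends only on the ratio.
    mean_dm = sum(5.0 * math.log10(o / t) for o, t in zip(obs, true)) / len(obs)
    return mean_obs, mean_true, mean_dm

mean_obs, mean_true, mean_dm = simulate_lk_bias()
print(f"observed {mean_obs:.2f} mas, true {mean_true:.2f} mas, "
      f"M_obs - M_true = {mean_dm:+.2f} mag")
```

In such a simulation the stars selected at a given observed parallax have, on average, a smaller true parallax, so the magnitudes inferred from the observed parallaxes come out too faint.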

The magnitude of the bias can be calculated analytically. Assuming a
uniform distribution of stars, LK found that the mean correction to the
derived absolute magnitude increases with increasing relative error,
reaching -0.43 mag for a 17.5% error in the parallax. The correction
itself is considerable, but since we deal with a statistical process
describing the numbers of stars scattering into and out of the allowed
distance range, the bias is represented by a probability distribution.
This was investigated by Koen (1992), who calculated the 90% confidence
intervals of the correction for the bias. For the case of a 17.5% error
in the parallax (almost a $6\sigma$ detection) he found the same
correction as Lutz & Kelker, but derived that the 90% confidence
interval ranges from +0.33 to -1.44 mag. This is to be compared with the
observational error of 0.4 mag based only on the propagation of the
error in the parallax. Such a correction is remarkable indeed, and
implies that parallax measurements should be corrected for this effect,
or that stringent selection criteria on the quality of the data should
be applied, before the data can be interpreted astrophysically.

Based on data from the ESA Hipparcos astrometry satellite.
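The 0.4 mag observational error quoted for a 17.5% parallax error is consistent with first-order error propagation through M = V + 5 log(pi) + 5. A minimal sketch, assuming that this linear propagation is what is meant (the function name is illustrative):

```python
import math

def propagated_mag_error(rel_parallax_error):
    """First-order error in absolute magnitude for a relative parallax error."""
    # M = V + 5 log10(pi) + 5  =>  dM = (5 / ln 10) * (d pi / pi), about 2.17 * (d pi / pi)
    return 5.0 / math.log(10.0) * rel_parallax_error

print(round(propagated_mag_error(0.175), 2))  # 0.38, i.e. roughly 0.4 mag
```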

Indeed, it appears that the only way to take the bias into account
before deriving astrophysical parameters from parallaxes is to use the
tables by Koen (1992), where the correction is given as a function of
the relative error in the parallax, $\sigma_\pi/\pi$. However, as has
been pointed out by Smith (1987c), the LK correction to an individual
parallax measurement of a star, or to a sample of stars, is valid only
when no additional a priori information, such as proper motions or
knowledge of the intrinsic absolute magnitude distribution, is
available. In the case of a Gaussian or top-hat intrinsic magnitude
distribution, Smith (1987a+b+c; see also Turon Lacarrieu & Creze 1977)
showed that the correction is a function of the true absolute magnitude
$M_0$, the intrinsic spread in $M_0$, $\sigma_M$, the observed parallax
$\pi$, and the error on the observed parallax $\sigma_\pi$, and not a
simple function of $\sigma_\pi/\pi$. For a magnitude-limited sample
(complete in parallax to a certain limiting magnitude), this correction,
according to Smith (1987c), approaches zero because the combination of
the Malmquist bias and the LK bias leads to a symmetric error in
magnitude. For other samples, the corrections asymptotically reach the
LK values, which apply when nothing about $M_0$ or $\sigma_M$ is known.

Figure 1. Upper panel: $M_{\rm par}$ as a function of $\pi$. The
ground-based parallaxes are listed with integer values. The solid lines
indicate the zone where we do not expect any data (see text). Middle and
lower panel: $\Delta M$ as a function of V and $\pi$. Although there
appears to be no correlation with V, there is a strong correlation with
parallax.

Considering the implications of the above and the fact 
that the presence of the bias has actually never been estab- 
lished empirically, we devised a simple empirical test using 
ground-based and Hipparcos data. 



2 A STATISTICAL BIAS IN PARALLAXES 

We investigate here whether there is any change in the absolute
magnitude determined from the measured parallax $\pi$ of a star as a
function of the relative error in the parallax, $\sigma_\pi/\pi$. If
there were a trend towards too faint absolute magnitudes with larger
$\sigma_\pi/\pi$, or even a large spread, then it is likely that an LK
bias is present.

The Hipparcos catalogue is a good starting point for such a test. It is
hard, however, to find a sample of stars for which we know the absolute
magnitudes from first principles. For example, selecting stars with the
same spectral type will not be sufficient, as such a sample will have a
large spread in intrinsic magnitudes, will inevitably suffer from
misclassifications, and will have completeness problems. On the other
hand, when stars with extremely good Hipparcos parallaxes are
considered, one may reasonably assume that their distances are well
determined. If one then compares such a sample of stars with their
(poorer quality) ground-based parallaxes, it is possible to investigate
the LK bias. We therefore selected all stars in the Hipparcos Input
Catalogue (1993) with trigonometric parallax measurements
$0 < (\sigma_\pi/\pi)_{\rm ground-based} < 1$. From the remaining 4007
objects, we selected those stars in the Hipparcos Catalogue with:

(i) $0 < (\sigma_\pi/\pi)_{\rm Hip} < 0.05$, i.e. a $20\sigma$ detection or better

(ii) Number of rejected data < 10% (Field H29) 

(iii) Goodness-of-fit smaller than 3 (Field H30) 

The first criterion ensures that we have a sample for which we may hope
that the data do not suffer from significant bias problems (if Koen
(1992) is correct, the mean bias is at most 0.025 mag, with 90%
confidence limits $\lesssim$ 0.2 mag), while the latter two criteria
ensure that the data are not hampered by observational problems as
discussed in the literature accompanying the Hipparcos database. These
three extra selection steps left us with a sample of 2187 stars.
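The selection described above can be written as a simple filter. The sketch below uses a hypothetical record layout (the `Star` field names are illustrative, not catalogue column names) and assumes that both relative-error cuts have a lower bound of zero, i.e. that only positive parallaxes are kept.

```python
from dataclasses import dataclass

@dataclass
class Star:
    # Hypothetical record layout; field names are illustrative only.
    plx_ground: float       # ground-based parallax (mas)
    e_plx_ground: float     # error on the ground-based parallax (mas)
    plx_hip: float          # Hipparcos parallax (mas)
    e_plx_hip: float        # error on the Hipparcos parallax (mas)
    rejected_pct: float     # percentage of rejected data (Field H29)
    goodness_of_fit: float  # goodness-of-fit statistic (Field H30)

def passes_selection(s: Star) -> bool:
    """Apply the ground-based cut and criteria (i)-(iii) to one star."""
    if s.plx_ground <= 0 or s.plx_hip <= 0:
        return False  # assumed: non-positive parallaxes are excluded
    rel_ground = s.e_plx_ground / s.plx_ground
    rel_hip = s.e_plx_hip / s.plx_hip
    return (0 < rel_ground < 1          # usable ground-based parallax
            and 0 < rel_hip < 0.05      # (i) 20 sigma or better
            and s.rejected_pct < 10     # (ii)
            and s.goodness_of_fit < 3)  # (iii)

sample = [Star(20.0, 5.0, 21.0, 0.8, 2.0, 1.1),
          Star(20.0, 5.0, 21.0, 2.0, 2.0, 1.1)]
selected = [s for s in sample if passes_selection(s)]
```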

The data are plotted in Figure 1. The upper panel shows the inferred
absolute visual magnitude derived from the ground-based parallax,
$M_{\rm par}$, plotted against the ground-based parallax. The stars
follow a well-defined band in the plot. This is easily understood.
Consider a single object with a measured V band magnitude: it can only
follow a straight line when its derived intrinsic magnitude is plotted
as a function of parallax. Consequently, all objects should lie between
the lines defined by the brightest and faintest V magnitude in the
sample. One can also say that the faintest V magnitude in the sample
defines a minimum possible value of the parallax for a given value of
$M_{\rm par}$. The difference between the (measured) absolute magnitude
of an object and the limiting (i.e. faintest) V magnitude of a sample
implies a maximum possible distance, and thus a minimum parallax. This
can be written as:



$$\pi > 1000 \times 10^{0.2\,(M_{\rm par} - V_{\rm max} - 5)} \qquad (1)$$



where $V_{\rm max}$ is the faintest magnitude of the sample. A similar
relation can be written for the brightest (minimum) magnitude in the
sample. The resulting boundaries are indicated by solid lines in the
upper panel of Fig. 1; the dashed lines represent V = 2 and 10,
encompassing the bulk of the sample. The middle panel shows the
difference ($M_{\rm par} - M_0$) [calculated from
$5\log(\pi_{\rm Hipparcos}/\pi_{\rm ground-based})$, hereafter
$\Delta M$] as a function of V. $\Delta M$ shows a large scatter,
especially for faint V, but the (unweighted) mean $\Delta M$ = -0.01 ±
0.8, which would appear rather reassuring. The propagated errors are not
plotted, but these are indicated in Fig. 2.
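The limiting parallax of Eq. 1 and the magnitude difference defined above are easy to evaluate numerically. The sketch below assumes parallaxes in milliarcseconds, as the factor 1000 suggests, and uses the expression 5 log(pi_Hipparcos / pi_ground-based) exactly as written in the text. The function names are illustrative.

```python
import math

def min_parallax_mas(m_par, v_max):
    """Smallest parallax (mas) compatible with M_par and a limiting magnitude V_max."""
    # From M = V + 5 log10(pi_mas / 1000) + 5 with V <= V_max.
    return 1000.0 * 10.0 ** (0.2 * (m_par - v_max - 5.0))

def delta_m(plx_hipparcos_mas, plx_ground_mas):
    """Delta M as written in the text: 5 log10(pi_Hipparcos / pi_ground-based)."""
    return 5.0 * math.log10(plx_hipparcos_mas / plx_ground_mas)

# A star with M_par = 5 in a sample limited at V = 10 lies within 100 pc.
print(min_parallax_mas(5.0, 10.0))
# A ground-based parallax twice the Hipparcos value gives a negative Delta M.
print(delta_m(10.0, 20.0))
```

With this expression, a ground-based parallax that is larger than the Hipparcos value yields a negative $\Delta M$.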

In the lower panel, $\Delta M$ is plotted against $\pi$. A strong
correlation is present. For large $\pi$, $\Delta M$ is close to zero,
indicating good distance determinations, followed by an increase of the
spread in values until $\Delta M$ decreases towards brighter absolute
magnitudes. There are no stars present in the upper left-hand corner of
the plot. This is not a real effect, but rather a completeness effect in
the data. In reality such objects are further away than expected, and
too faint to be included in the sample. This zone is effectively
forbidden, as is illustrated by the solid lines in the upper panel,
where the lower left-hand corner is void.

Figure 2. $\Delta M$ as a function of $\sigma_\pi/\pi$. The triangles
with error bars indicate the propagated errors in magnitudes from
parallaxes as a function of $\sigma_\pi/\pi$. The large solid circles
indicate the average $\Delta M$ binned over intervals of 0.1 in
$\sigma_\pi/\pi$. The thick lines indicate the LK bias calculated by
Koen (1992) for a uniform density of stars. The dashed lines indicate
the 90% confidence intervals.

Although completeness effects play an important role, the resulting
distribution of $\Delta M$ is very wide, and illustrates that absolute
magnitude estimates based on trigonometric parallaxes are subject to
large scatter, with a range of -3 to 3 magnitudes.

Figure 2 shows $\Delta M$ plotted against $\sigma_\pi/\pi$. Since the
range in $\sigma_\pi$ is much smaller than the range in $\pi$ (the
average $\sigma_\pi$ is 7.9 ± 2.5 mas), the trends are roughly the same
as in the ($\Delta M$ - $\pi$) relation of Figure 1. The filled circles
indicate the average $\Delta M$, with its standard deviation, binned
over intervals of 0.1 in $\sigma_\pi/\pi$. These data are compared with
the results of Koen (1992) for the case of a uniform density of stars
and an infinite number of measurements. The thick solid line indicates
the mean bias calculated by Koen, and the thick dashed lines represent
the 90% confidence intervals of the bias correction. For a smaller
number of measurements, the bias and the corrections are larger, while
for a decreasing stellar density (Koen's p = 2 case), the corrections
themselves are somewhat smaller, but the confidence intervals are
similar. As most of the stars in our sample are located within 100 pc
(see Fig. 1), the assumption of a uniform stellar density is probably
close to the real situation.
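The binned averages described above can be computed along the following lines. This is a minimal sketch, not the authors' code: the function name, the tuple layout of the result, and the use of the sample standard deviation are assumptions.

```python
import math
from statistics import mean, stdev

def binned_mean(rel_errors, delta_ms, width=0.1):
    """Average Delta M in bins of relative parallax error.

    Returns a list of (bin_start, bin_end, mean, std, count) tuples,
    one for each non-empty bin, ordered by bin_start.
    """
    bins = {}
    for x, dm in zip(rel_errors, delta_ms):
        bins.setdefault(math.floor(x / width), []).append(dm)
    result = []
    for k in sorted(bins):
        vals = bins[k]
        # The sample standard deviation needs at least two points.
        sd = stdev(vals) if len(vals) > 1 else float("nan")
        result.append((k * width, (k + 1) * width, mean(vals), sd, len(vals)))
    return result

for row in binned_mean([0.05, 0.07, 0.15], [0.2, 0.4, -1.0]):
    print(row)
```

Values that fall exactly on a bin edge may land on either side because of floating-point rounding, so a production version would need an explicit edge convention.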

For $\sigma_\pi/\pi$ < 20%, the observations and the predictions by
Koen agree very well. Beyond that, the average $\Delta M$ reaches 0 and
goes towards too bright magnitudes when $\sigma_\pi/\pi$ increases
further. It is due to the objects with large $\sigma_\pi/\pi$ (which
are affected by the completeness of the sample) that the overall mean
$\Delta M$ is close to zero. If a selection on parallax, or on
$\sigma_\pi/\pi$, had been applied to these data, the resulting mean
$M_{\rm par}$ would have led to too faint mean intrinsic magnitudes.
Unfortunately, our sample does not allow the construction of a
magnitude-limited sample. It would thus appear that there is indeed an
LK-type bias in parallax data. In the following, we will concentrate on
the correction for the LK bias.



2.1 A correction for the Lutz-Kelker bias 

One of the assumptions in the analyses by Lutz & Kelker (1973) and Koen
(1992) is that the absolute magnitude of a sample of stars, and its
spread, are unknown. Smith (1987b+c) and Turon Lacarrieu & Creze (1977)
considered the case of a luminosity function with a Gaussian spread (for
a uniform distribution of stars) and derived a formalism for the
correction of the LK bias. This correction turns out to be a function of
$M_0$, $\sigma_M$, and $\sigma_\pi/\pi$, and converges to the LK value
for large $\sigma_M$. A linear approximation to this rigorous
correction, valid when $|\Delta M| < 2.17$, was derived by Smith
(1987c):



$$\delta M = \frac{4.715\,(\sigma_\pi/\pi)^2}{\sigma_M^2 + 4.715\,(\sigma_\pi/\pi)^2}\,(M_0 - M_{\rm par}) \qquad (2)$$



For large $\sigma_M$, the linear approximation approaches zero, whereas
the exact value of the rigorous correction is close to the LK value
(Smith 1987c).
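The behaviour of the linear approximation can be explored numerically. The sketch below assumes that the correction has the form of a weighted pull of $M_{\rm par}$ towards $M_0$, with weight $4.715(\sigma_\pi/\pi)^2 / [\sigma_M^2 + 4.715(\sigma_\pi/\pi)^2]$. This form is chosen because it reproduces the limits described in the text (it vanishes for large $\sigma_M$ and equals the full offset $M_0 - M_{\rm par}$ when $\sigma_M = 0$). The function name is illustrative.

```python
import math

# (5 / ln 10)**2 is about 4.715: the squared factor converting a relative
# parallax error into a magnitude error.
K = (5.0 / math.log(10.0)) ** 2

def smith_linear_correction(m_par, m0, sigma_m, rel_err):
    """Assumed linear correction delta M, to be added to M_par."""
    x = K * rel_err ** 2
    if x == 0.0 and sigma_m == 0.0:
        raise ValueError("sigma_m and rel_err cannot both be zero")
    return x / (sigma_m ** 2 + x) * (m0 - m_par)

# With a 20% parallax error and sigma_M = 0.28, part of the offset is removed.
dm = smith_linear_correction(m_par=5.2, m0=4.49, sigma_m=0.28, rel_err=0.2)
print(dm)
```

The weight depends on both $\sigma_M$ and $\sigma_\pi/\pi$, which is why the correction cannot be tabulated as a function of the relative parallax error alone.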

To appreciate the usefulness of this particular result and to see
whether the correction can be used as a means to derive the 'true'
intrinsic magnitude of a sample of stars, we select a sub-sample from
our sample of objects. To mimic a sample of stars with approximately the
same mean intrinsic magnitude, we restrict ourselves to a narrow range
of true absolute magnitudes, and apply the correction given in Eq. 2.
There are 313 objects present in the intrinsic magnitude (i.e. derived
from the Hipparcos parallaxes) bin 4 < $M_0$ < 5. We took the mean
$M_0$ and its standard deviation (4.49 ± 0.28 mag) as input values for
Eq. 2. The results are plotted in Fig. 3. The upper panel shows
$\Delta M$ as a function of $\sigma_\pi/\pi$, which is similar to that
for the larger sample depicted in Fig. 2. The corrected values are shown
in the lower panel, and the mean intrinsic magnitude appears to be
retrieved.

Hence, it is possible to correct a sample of objects for the LK bias. It
would seem that this correction for the statistical biases is not very
useful, since one has to know the answer already before applying the
correction. However, if one studies a sample of objects of which one may
assume that they all have the same $M_0$, it may be the basis for a
powerful method to derive the absolute magnitude. To illustrate this,
Fig. 3 also shows the resulting $\Delta M$ for other values of $M_0$ in
Eq. 2. The upper cloud of points was obtained by inserting $M_0$ = 8.5
in the equation, while the lower cloud of points was calculated using
$M_0$ = 0.5. For $\sigma_\pi/\pi \gtrsim 0.4$, a strong dependence on
the input value of $M_0$ is present. It appears, then, that the correct
value of $M_0$ can be obtained iteratively by varying the input value of
$M_0$ to obtain a horizontal line, or a minimum spread around 0.
Following a procedure outlined later in more detail, we found the best
value for the intrinsic magnitude of the sample to be in the range 4.48
- 4.59 when small (< 0.1) and large (> 0.5) values, respectively, of the
(assumed to be unknown) spread in intrinsic magnitude are used.
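The idea of varying the input $M_0$ until the corrected residuals show a minimum spread around zero can be sketched as a simple grid search. The code below is an illustration under stated assumptions, not the authors' procedure: it assumes the weighted-pull form of the linear correction (weight $4.715(\sigma_\pi/\pi)^2 / [\sigma_M^2 + 4.715(\sigma_\pi/\pi)^2]$), uses the mean squared residual as the measure of spread, and runs on made-up input values.

```python
def smith_corrected(m_par, m0, sigma_m, rel_err):
    """M_par after the assumed linear correction towards a trial M_0."""
    x = 4.715 * rel_err ** 2
    return m_par + x / (sigma_m ** 2 + x) * (m0 - m_par)

def best_m0(m_par_list, rel_err_list, sigma_m, candidates):
    """Trial M_0 that minimises the mean squared corrected residual."""
    def spread(m0):
        res = [smith_corrected(m, m0, sigma_m, r) - m0
               for m, r in zip(m_par_list, rel_err_list)]
        return sum(v * v for v in res) / len(res)
    return min(candidates, key=spread)

# Made-up example: two precise stars near M = 4.5 and one imprecise outlier.
m_par = [4.4, 4.6, 3.0]
rel_err = [0.05, 0.05, 0.6]
grid = [i * 0.01 for i in range(901)]  # trial M_0 from 0 to 9
print(best_m0(m_par, rel_err, sigma_m=0.3, candidates=grid))
```

In this toy example the result stays close to the two precise stars, because a star with a large relative error is pulled almost entirely onto the trial value and therefore contributes little to the spread.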

The above exercise shows that a correction for the LK bias is not a
simple function of $\sigma_\pi/\pi$. In the case of an individual object
of which nothing is known, the LK correction, along with its large
confidence interval, is the only remaining option. However, for a sample
of which all stars are known to have the same intrinsic magnitude, the
LK correction as determined by Smith (1987c) is a potentially powerful
tool to determine $M_0$. As shown above, this method returns a
surprisingly good result on our sub-sample of objects. In the following
we will apply this to a sample of Cepheids.

Figure 3. Upper panel: as the previous figure, now for objects with
4 < $M_0$ < 5. Lower panel: $\Delta M$ with $M_{\rm par}$ corrected
using Eq. 2. The upper and lower clouds of points respectively show the
dependence of the solution when a different $M_0$ is used ($M_0$ = 8.5
and 0.5 respectively instead of 4.49).



3 THE CEPHEID DISTANCE SCALE 

Given the presence of a bias in parallax data described above, it is
surprising that several studies based on Hipparcos data, whilst not
taking the LK bias into account, yield results that are relatively close
to previous results. For example, the new zero-point of the Cepheid
Period-Luminosity (PL) relation that Feast & Catchpole (1997, hereafter
FC) found increased the distance modulus to the Large Magellanic Cloud
by (only) 0.2 ± 0.1 magnitude, and implied a decrease in the
Cepheid-based Hubble constant of about 10%. Fortunately, we can
construct a similar test for the Cepheids as described above. Contrary
to the sample in the previous section, where the extremely good
Hipparcos parallaxes provided the true absolute magnitude, we now have
to use another indicator. The true absolute magnitude of the Cepheids is
assumed to follow the PL relation for Cepheids (cf. FC):



$$\langle M_V \rangle = \delta \log P + p \qquad (3)$$



with p = -1.43 ± 0.10 and $\delta$ = -2.81 (as derived and assumed
respectively by FC). The Hipparcos parallaxes, combined with the
reddening-corrected $\langle V \rangle$ magnitudes, then give
$M_{\rm par}$. We took the data of the 26 'best' Cepheids which
contributed the largest weight (87% from a sample of 220 objects) to the
solutions from FC. The stars were corrected for reddening in the same
way as in FC.
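For a single Cepheid, the two magnitudes being compared can be computed as follows. This is a minimal sketch with illustrative function names: the period is assumed to be in days, the parallax in milliarcseconds, and the default coefficients are the FC values quoted above.

```python
import math

def pl_absolute_magnitude(period_days, p=-1.43, delta=-2.81):
    """<M_V> from the PL relation of Eq. 3: delta * log10(P) + p."""
    return delta * math.log10(period_days) + p

def parallax_absolute_magnitude(v0_mag, plx_mas):
    """M_par from a dereddened mean magnitude and a parallax in mas."""
    return v0_mag + 5.0 * math.log10(plx_mas / 1000.0) + 5.0

# A 10-day Cepheid has <M_V> = -2.81 - 1.43 on this relation.
print(pl_absolute_magnitude(10.0))
# Sanity check: at 10 pc (100 mas) the absolute and apparent magnitudes coincide.
print(parallax_absolute_magnitude(5.0, 100.0))
```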

The relation between $\Delta M$ and $\sigma_\pi/\pi$ is plotted in
Fig. 4. The same trend as in Fig. 2 is present. For small
$\sigma_\pi/\pi$, $\Delta M$ tends to indicate fainter intrinsic
magnitudes than predicted from the PL relation, whereas for large
$\sigma_\pi/\pi$, the Cepheids have too bright magnitudes by as much as
4 mag! The fact that the Cepheids with large $\sigma_\pi/\pi$ are all
too bright is due to the completeness effect discussed earlier. The
reason that the FC zero-point was close to previous values is that the
LK bias is canceled out by the completeness effects to a certain, but
ill-defined, degree.

Figure 4. As the previous figure, now for Hipparcos measurements of 26
Cepheids. The true magnitude was calculated using the Period-Luminosity
relation (Eq. 3 with p = -1.43).

One can try to correct for the LK bias using Eq. 2, provided we know all
relevant parameters. A difference with the case illustrated in Fig. 3 is
that the parameter we want to estimate is not $M_0$ but p, the
difference between $M_V$ and $\delta \log P$. A second difference is
that $\sigma_M$ (or equivalently in this case, $\sigma_p$) is unknown.
From the 26 Cepheids we consider the 20 fundamental mode pulsators, as
indicated by FC. For this sample FC found a zero-point p = -1.49 ± 0.13
(corresponding to a distance modulus to the LMC of 18.76). Three stars
have values of $\Delta M$ larger than 2.17, so Eq. 2 is not applicable
to them, and they are discarded in the analysis. The reason that we
chose not to include the six first overtone pulsators is discussed
below. We can now derive an improved value for the zero-point of the
Cepheid PL relation by applying the LK correction to the parallaxes.

The basic principle behind the following procedure is to iteratively
vary p until the variance around $\Delta M = 0$ is minimal. For a range
of values of p, we calculated the 'true' $M_0$, and $p_{\rm par}$ from
$p_{\rm par} = M_{\rm par} - \delta \log P$, for every star. The
quantity $\Delta p = p_{\rm par}({\rm corrected}) - p$ is then
calculated, where $p_{\rm par}$(corrected) is the value of $p_{\rm par}$
after applying Eq. 2. We calculate the mean and standard deviation of
$\Delta p$, and $Q^2 = \left(\sum (\Delta p/s)^2\right)/(N-1)$. For s
we assumed $\sigma_p/\sqrt{N}$, with N = 17 the number of stars in the
sample. The best value of p is found where $Q^2$ has a minimum,
$\chi^2$. The $1\sigma$ uncertainty around the best value is estimated
from those values of p for which $Q^2 = \chi^2 + 1$. The results are
presented in Table 1, where the standard deviation in the mean of
$\Delta p$ is also listed.
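The procedure above can be sketched as a grid search over p. The code below is an illustration, not the authors' implementation: it assumes the weighted-pull form of the linear correction applied to $p_{\rm par}$ (weight $4.715(\sigma_\pi/\pi)^2 / [\sigma_p^2 + 4.715(\sigma_\pi/\pi)^2]$), takes the $1\sigma$ range as the grid values with $Q^2$ no larger than the minimum plus one, and runs on made-up input values.

```python
import math

def q_squared(p, p_par, rel_err, sigma_p):
    """Q^2 = sum((Delta p / s)^2) / (N - 1), with s = sigma_p / sqrt(N)."""
    n = len(p_par)
    s = sigma_p / math.sqrt(n)
    total = 0.0
    for pp, r in zip(p_par, rel_err):
        x = 4.715 * r * r
        corrected = pp + x / (sigma_p ** 2 + x) * (p - pp)  # assumed form of Eq. 2
        total += ((corrected - p) / s) ** 2
    return total / (n - 1)

def fit_zero_point(p_par, rel_err, sigma_p, grid):
    """Return (best p, lower bound, upper bound, minimum Q^2) on the grid."""
    q = [q_squared(p, p_par, rel_err, sigma_p) for p in grid]
    q_min = min(q)
    best = grid[q.index(q_min)]
    inside = [p for p, v in zip(grid, q) if v <= q_min + 1.0]
    return best, min(inside), max(inside), q_min

# Made-up values of p_par for four stars with equal relative parallax errors.
p_par = [-1.2, -1.4, -1.3, -1.3]
rel_err = [0.1, 0.1, 0.1, 0.1]
grid = [-2.0 + 0.005 * i for i in range(301)]
best, lo, hi, q_min = fit_zero_point(p_par, rel_err, sigma_p=0.1, grid=grid)
print(best, lo, hi)
```

With equal relative errors every star carries the same weight, so the best p coincides with the plain mean of the $p_{\rm par}$ values. With unequal errors the stars with large $\sigma_\pi/\pi$ contribute less.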

A complication is that the spread $\sigma_p$ is unknown. Obviously, if
we use $\sigma_p = 0$, all corrections will be equal to the observed
$\Delta M$. All input values of p would then yield equally low values of
$Q^2$, and hence the error estimate in p based on the variation of
$\chi^2$ would be very large. The best value for p is almost equal for
every adopted $\sigma_p$. We adopt a final value of p = -1.29 ± 0.02,
with the error based on the scatter in the best-fitting values of p.
This means a change in p of 0.20 with respect to FC (from -1.49 to
-1.29) for exactly the same sample and, all things being equal, an LMC
distance modulus of 18.56.

Table 1. Zero-point p of Cepheids for the 17 best stars

Figure 5. As previous figure, but now for the fundamental mode Cepheids,
after correction for the LK bias, for p = -1.29. The open circles
represent the three stars not used in the analysis because they have
$\Delta M$ > 2.17 (see Fig. 4). The six crosses represent the first
overtone pulsators. Note the change in vertical axis compared to the
previous figure.

The extent to which this procedure has compensated for the bias is
illustrated in Fig. 5, where $\Delta p$ is plotted against
$\sigma_\pi/\pi$ for the case p = -1.29, $\sigma_p$ = 0.175, which has a
standard deviation around the mean equal to the adopted uncertainty.
Note that all 20 fundamental pulsators are plotted, although the fitting
was done excluding the three stars with initially the largest
$\Delta M$. The figure also illustrates the reason why we chose not to
include the six first overtone pulsators in the analysis: three of them
give significantly higher residuals. This may be taken as evidence that
the procedure by FC to transform first overtone to fundamental mode
pulsation period (Eqs. 8 and 9 in FC) introduces additional noise. Like
FC, we keep the value of $\delta$ fixed. Taking the error in $\delta$
into account will increase the uncertainty in the zero-point. Repeating
the analysis of FC, we find for their sample of non-overtone pulsators
that changing $\delta$ by ±0.06 (the uncertainty in the slope derived by
Caldwell & Laney 1991 and adopted by FC) changes their zero-point by
0.05. We performed the same test for our procedure, and find that
changing $\delta$ by ±0.06 results in a shift of the best-fitting p of
±0.05. Taking into account errors in p (±0.02), in $\delta$ (±0.05) and
in the visual dereddened magnitudes (±0.06), our best estimate of the
LMC distance modulus based on a re-analysis of the Cepheid sample of FC
is 18.56 ± 0.08. This value is in better agreement with previous
determinations using Cepheids (see e.g. the listing by Madore &
Freedman, 1997) and RR Lyrae variables (e.g. Alcock et al. 1997).
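The arithmetic behind the final numbers can be checked in a few lines. The sketch assumes that the three quoted error contributions are independent and combined in quadrature, which reproduces the quoted ±0.08. The function names are illustrative.

```python
import math

def shifted_distance_modulus(dm_ref, p_ref, p_new):
    """Distance modulus after changing the PL zero-point from p_ref to p_new."""
    # A fainter zero-point (larger p) makes the Cepheids intrinsically fainter,
    # so the same apparent magnitudes imply a smaller distance modulus.
    return dm_ref - (p_new - p_ref)

def quadrature_sum(*errors):
    """Combine independent errors in quadrature."""
    return math.sqrt(sum(e * e for e in errors))

print(round(shifted_distance_modulus(18.76, -1.49, -1.29), 2))  # 18.56
print(round(quadrature_sum(0.02, 0.05, 0.06), 2))               # 0.08
```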
© 0000 RAS, MNRAS 000, 000-000 



4 CONCLUDING REMARKS 

We have investigated whether there is a Lutz-Kelker type bias present in
parallax data. For the derivation of the 'true' intrinsic magnitudes,
parallaxes of $20\sigma$ or better were taken from the Hipparcos data,
and compared with less accurate ground-based parallaxes that were
available for these stars. For small relative errors in the parallaxes,
$\Delta M$ is distributed asymmetrically around zero, with a preference
for too faint magnitudes, as is expected from an LK-type bias. The
spread in values is consistent with the confidence intervals for the
bias that were calculated by Koen (1992). For larger values of
$\sigma_\pi/\pi$, the parallaxes tend to yield too bright magnitudes.
This can be explained by completeness effects in the data, where the
limiting magnitude of the sample implies a lower limit to the observed
parallaxes. A simple method to correct for the bias has been presented
and tested. The Cepheid data of Feast & Catchpole (1997) were then
investigated. A re-analysis of these data, taking the biases into
account, returns a value of the distance modulus of 18.56 ± 0.08, which
is 0.14 magnitudes smaller than FC found, and in good agreement with
previous determinations.

Finally, we note that unless parallax measurements are extremely
precise, the determination of astrophysical parameters from these data
will be affected by LK-type biases. For the moment, either using
extremely precise data, or taking into account the LK bias with its
large confidence intervals, seems to be the only option for individual
objects. The simple correction proposed here can be used to obtain more
reliable estimates of the mean intrinsic magnitude of a sample of stars,
provided one knows beforehand that the objects have the same luminosity.
The assumption of a uniform distribution of stars used throughout this
paper may be improved upon by using Monte Carlo calculations of the
distribution of stars in the line of sight of the targets.